Using Provenance for Repeatability

Authors

  • Quan Pham
  • Tanu Malik
  • Ian T. Foster

Abstract

We present Provenance-To-Use (PTU), a tool that minimizes computation time during repeatability testing. Authors can use PTU to build a package that includes their software program and a provenance trace of an initial reference execution. Testers can then select a subset of the package's processes for a partial deterministic replay, choosing them, for example, by the compute, memory, and I/O utilization measured for each process during the reference execution. Using the provenance trace, PTU guarantees that events are processed in the same order using the same data from one execution to the next. We show the efficiency of PTU for conducting repeatability testing of workflow-based scientific programs.
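
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes a simplified, hypothetical trace schema (`ProcessRecord`) and a hypothetical helper (`select_for_replay`); neither is PTU's actual data format or interface, and the numbers are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessRecord:
    """One process from a reference execution (hypothetical schema, not PTU's)."""
    name: str
    cpu_seconds: float
    peak_memory_mb: float
    io_mb: float


def select_for_replay(trace, max_cpu_seconds, max_memory_mb, max_io_mb):
    """Return the names of processes whose recorded usage fits every budget."""
    return [
        p.name
        for p in trace
        if p.cpu_seconds <= max_cpu_seconds
        and p.peak_memory_mb <= max_memory_mb
        and p.io_mb <= max_io_mb
    ]


# Made-up measurements standing in for a recorded reference execution.
reference_trace = [
    ProcessRecord("preprocess", cpu_seconds=12.0, peak_memory_mb=256.0, io_mb=40.0),
    ProcessRecord("simulate", cpu_seconds=3600.0, peak_memory_mb=8192.0, io_mb=500.0),
    ProcessRecord("plot", cpu_seconds=5.0, peak_memory_mb=128.0, io_mb=2.0),
]

# A tester with a small budget keeps only the cheap processes for re-execution.
selected = select_for_replay(
    reference_trace, max_cpu_seconds=60.0, max_memory_mb=1024.0, max_io_mb=100.0
)
print(selected)
```

In this sketch the expensive `simulate` step is left out of the replay set. In a partial replay of the kind the abstract describes, the recorded trace would be what supplies the data such a skipped process produced, so the selected processes still see the same inputs in the same order.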

Similar resources

The Path to Virtual Machine Images as First Class Provenance

The scientific community’s increased exposure to cloud computing has led to increased familiarity with the machine virtualization technology that underpins the cloud. Efforts to define and implement provenance for the cloud are under way. In the meantime, however, an orthogonal idea, aimed at quickly facilitating repeatability and curation, has taken shape. This is the idea of using virtual mac...

Provenance, XML, and the Scientific Web

Science is now being revolutionized by the capabilities of distributing computation and human effort over the World Wide Web. This revolution offers dramatic benefits but also poses serious risks due to the fluid nature of digital information. The Web today does not provide adequate repeatability, reliability, accountability and trust guarantees for scientific applications. One important part o...

Provenance Management for SPARQL Updates

During the last few years we have witnessed an explosion in the publication of data in the Web, mainly in the form of Linked Data. Scientific, corporate or even governmental data are made available for open access and used by applications, individual users and communities. Given the increasing amount and the heterogeneity of this data, it is of crucial importance to be able to track its provenan...

Using Cloud-Aware Provenance to Reproduce Scientific Workflow Execution on Cloud

Provenance has been thought of as a mechanism to verify a workflow and to provide workflow reproducibility. This provenance of scientific workflows has been effectively carried out in Grid-based scientific workflow systems. However, the recent adoption of Cloud-based scientific workflows presents an opportunity to investigate the suitability of existing approaches or propose new approaches to collect p...

Automatically Tracking Metadata and Provenance of Machine Learning Experiments

We present a lightweight system to extract, store and manage metadata and provenance information of common artifacts in machine learning (ML) experiments: datasets, models, predictions, evaluations and training runs. Our system accelerates users in their ML workflow, and provides a basis for comparability and repeatability of ML experiments. We achieve this by tracking the lineage of produced a...

Publication date: 2013